1,540 research outputs found

    Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis

    This paper proposes a forward attention method for the sequence-to-sequence acoustic modeling of speech synthesis. The method is motivated by the monotonic nature of the alignment from phone sequences to acoustic sequences. Only the alignment paths that satisfy the monotonic condition are taken into consideration at each decoder timestep. The modified attention probabilities at each timestep are computed recursively using a forward algorithm. A transition agent for forward attention is further proposed, which helps the attention mechanism decide whether to move forward or stay at each decoder timestep. Experimental results show that the proposed forward attention method converges faster and is more stable than the baseline attention method. In addition, forward attention with the transition agent can help improve the naturalness of synthetic speech and control its speed effectively. Comment: 5 pages, 3 figures, 2 tables. Published in IEEE International Conference on Acoustics, Speech and Signal Processing 2018 (ICASSP 2018)
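    The recursion described in this abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the form used here is an assumption: each position keeps the mass that either stayed at it or moved forward from its left neighbour, weighted by the ordinary attention probability, and the result is renormalized. The function names `forward_attention_step` and `run_forward_attention` are hypothetical, and the transition agent is not modeled.

```python
def forward_attention_step(prev_alpha, y):
    """One decoder step of a forward-style monotonic attention update.

    prev_alpha: normalized forward weights from the previous step.
    y: ordinary attention probabilities at the current step.
    Assumed rule (not stated in the abstract):
        alpha'[n] = (prev_alpha[n] + prev_alpha[n - 1]) * y[n]
    followed by renormalization over n.
    """
    if len(prev_alpha) != len(y):
        raise ValueError("prev_alpha and y must have the same length")
    unnormalized = []
    for n in range(len(y)):
        stay = prev_alpha[n]                         # path stays at phone n
        move = prev_alpha[n - 1] if n > 0 else 0.0   # path advances from n-1
        unnormalized.append((stay + move) * y[n])
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("all monotonic paths have zero probability")
    return [u / total for u in unnormalized]


def run_forward_attention(ys):
    """Apply the step over a list of per-timestep attention probabilities.

    The alignment is assumed to start entirely on the first phone.
    """
    n_phones = len(ys[0])
    alpha = [1.0] + [0.0] * (n_phones - 1)
    history = []
    for y in ys:
        alpha = forward_attention_step(alpha, y)
        history.append(alpha)
    return history
```

    With uniform attention over four phones, the weights can spread by at most one position per step, so positions further right stay at exactly zero until a monotonic path can reach them.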

    How to interpret a discovery or null result of the $0\nu 2\beta$ decay

    The Majorana nature of massive neutrinos will be crucially probed in the next-generation experiments of the neutrinoless double-beta ($0\nu 2\beta$) decay. The effective mass term of this process, $\langle m\rangle_{ee}$, may be contaminated by new physics. So how to interpret a discovery or null result of the $0\nu 2\beta$ decay in the foreseeable future is highly nontrivial. In this paper we introduce a novel three-dimensional description of $|\langle m\rangle_{ee}|$, which allows us to see its sensitivity to the lightest neutrino mass and two Majorana phases in a transparent way. We examine to what extent the free parameters of $|\langle m\rangle_{ee}|$ can be well constrained provided a signal of the $0\nu 2\beta$ decay is observed someday. To fully explore lepton number violation, all six effective Majorana mass terms $\langle m\rangle_{\alpha\beta}$ (for $\alpha, \beta = e, \mu, \tau$) are calculated and their lower bounds are illustrated with two-dimensional contour figures. The effect of possible new physics on the $0\nu 2\beta$ decay is also discussed in a model-independent way. We find that the result of $|\langle m\rangle_{ee}|$ in the normal (or inverted) neutrino mass ordering case modified by the new physics effect may somewhat mimic that in the inverted (or normal) mass ordering case in the standard three-flavor scheme. Hence a proper interpretation of a discovery or null result of the $0\nu 2\beta$ decay may demand extra information from some other measurements. Comment: 13 pages, 6 figures, Figures and references update
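    For orientation, the effective Majorana mass terms named in this abstract are conventionally built from the neutrino masses $m_i$ and the lepton flavor mixing matrix $U$. The expression below is the standard textbook definition rather than something stated in the abstract, and the phase convention for $U$ is deliberately left unspecified:

```latex
\langle m\rangle_{\alpha\beta} = \sum_{i=1}^{3} m_i \, U_{\alpha i} U_{\beta i} ,
\qquad \alpha, \beta = e, \mu, \tau .
```

    The expression is symmetric in $\alpha$ and $\beta$, which is why there are six independent terms; the case $\alpha = \beta = e$ is the one probed by the $0\nu 2\beta$ decay.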

    Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses

    This paper presents a novel speech phase prediction model which predicts wrapped phase spectra directly from amplitude spectra by neural networks. The proposed model is a cascade of a residual convolutional network and a parallel estimation architecture. The parallel estimation architecture is composed of two parallel linear convolutional layers and a phase calculation formula, imitating the process of calculating the phase spectra from the real and imaginary parts of complex spectra and strictly restricting the predicted phase values to the principal value interval. To avoid the error expansion issue caused by phase wrapping, we design anti-wrapping training losses defined between the predicted wrapped phase spectra and natural ones by activating the instantaneous phase error, group delay error and instantaneous angular frequency error using an anti-wrapping function. Experimental results show that our proposed neural speech phase prediction model outperforms the iterative Griffin-Lim algorithm and other neural network-based methods in terms of both reconstructed speech quality and generation speed. Comment: Accepted by ICASSP 2023. Codes are availabl
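    The two ideas in this abstract, a phase formula restricted to the principal value interval and an anti-wrapping function applied to phase errors, can be sketched as follows. The abstract gives neither formula, so both are assumptions: the phase is computed with the standard two-argument arctangent, and the anti-wrapping function is taken to be the absolute distance to the nearest multiple of 2π. The names `phase_from_parts` and `anti_wrap` are hypothetical.

```python
import math


def phase_from_parts(real, imag):
    """Phase of a complex value from its real and imaginary parts.

    math.atan2 returns a value in [-pi, pi], i.e. within the principal
    value interval (assumed stand-in for the paper's phase formula).
    """
    return math.atan2(imag, real)


def anti_wrap(x):
    """Assumed anti-wrapping function: |x - 2*pi*round(x / (2*pi))|.

    Maps a raw phase difference to its distance from the nearest
    multiple of 2*pi, so differences near 2*pi count as small errors.
    """
    two_pi = 2.0 * math.pi
    return abs(x - two_pi * round(x / two_pi))


# A predicted phase of 3.1 and a natural phase of -3.1 are almost the same
# angle, yet their raw difference is 6.2; anti_wrap reduces it to 2*pi - 6.2.
raw_error = 3.1 - (-3.1)
wrapped_error = anti_wrap(raw_error)
```

    Without such a function, a plain absolute or squared error would penalize the example pair heavily even though the two angles nearly coincide on the unit circle.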